multilevel adventures
Often, there are opportunities to cluster your observations – repeated measures, group membership, hierarchies, even different measures for the same participant. Whenever you can cluster, you should!
Let’s return to our data from Tuesday.
data_path = "https://raw.githubusercontent.com/sjweston/uobayes/refs/heads/main/files/data/external_data/williams.csv"
d <- read.csv(data_path)
rethinking::precis(d)
                 mean        sd       5.5%     94.5%  histogram
id 98.61029694 63.7493291 10.0000000 207.000000 ▇▇▇▃▅▅▅▃▃▃▂▂
female 0.57016803 0.4950710 0.0000000 1.000000 ▅▁▁▁▁▁▁▁▁▇
PA.std 0.01438236 1.0241384 -1.6812971 1.751466 ▁▁▃▇▇▃▁▁
day 44.17962096 27.6985612 6.0000000 92.000000 ▇▇▇▇▅▅▅▃▃▃
PA_lag 0.01992587 1.0183833 -1.7204351 1.717036 ▁▂▃▅▇▇▅▃▂▁
NA_lag -0.04575229 0.9833161 -0.8750718 1.990468 ▇▃▂▁▁▁▁▁▁▁▁▁▁▁
steps.pm 0.05424387 0.6298941 -1.0258068 1.011356 ▁▂▇▇▃▁▁▁
steps.pmd 0.02839160 0.7575395 -1.1235951 1.229974 ▁▁▇▇▁▁▁▁▁▁▁▁▁▁
NA.std -0.07545093 0.9495660 -1.0484293 1.811061 ▁▁▁▇▂▁▁▁▁▁
We fit an intercept-only MLM to these data to estimate positive affect.
\[\begin{align*} \text{PA.std}_i &\sim \text{Normal}(\mu_i, \sigma) \\ \mu_i &= \alpha_{\text{id}[i]} \\ \alpha_j &\sim \text{Normal}(\bar{\alpha}, \sigma_{\alpha}) \text{ for j in 1...239}\\ \bar{\alpha} &\sim \text{Normal}(0, 1.5)\\ \sigma_{\alpha} &\sim \text{Exponential}(1) \\ \sigma &\sim \text{Exponential}(1) \\ \end{align*}\]
m1 <- brm(
data=d,
family=gaussian,
PA.std ~ 1 + (1 | id),
prior = c( prior(normal(0, 1.5), class=Intercept),
prior(exponential(1), class=sd),
prior(exponential(1), class=sigma)),
iter=4000, warmup=1000, chains=4, cores=4, seed=9, #running this longer so it mixes
control = list(adapt_delta =.99),
file=here("files/models/m71.2")
)
Let’s add time to our model. Because time varies within person from assessment to assessment, it is a Level 1 variable. Note that I have NOT added a varying component at Level 2 – in other words, I’m stating that the effect of time on positive affect is fixed, i.e., identical across participants.
\[\begin{align*} \text{Level 1} &\\ \text{PA.std}_{ij} &= \alpha_j + \beta_j(\text{day}_{ij}) + \epsilon_{ij} \\ \text{Level 2} &\\ \alpha_j &= \gamma_0 + U_{0j} \\ \beta_j &= \gamma_1 \\ \end{align*}\]
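Substituting the Level 2 equations into Level 1 gives the equivalent combined (reduced-form) equation, which is what brm fits directly:

\[\text{PA.std}_{ij} = \gamma_0 + \gamma_1(\text{day}_{ij}) + U_{0j} + \epsilon_{ij}\]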
m2 <- brm(
data=d,
family=gaussian,
PA.std ~ 1 + day + (1 | id),
prior = c( prior(normal(.50, .25), class=Intercept),
prior(normal(0, 1), class=b), #new prior for new term
prior(exponential(1), class=sd),
prior(exponential(1), class=sigma)),
iter=2000, warmup=1000, chains=4, cores=4, seed=9,
file=here("files/models/m72.2")
)
 Family: gaussian
Links: mu = identity; sigma = identity
Formula: PA.std ~ 1 + day + (1 | id)
Data: d (Number of observations: 13033)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Multilevel Hyperparameters:
~id (Number of levels: 193)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.80 0.04 0.73 0.88 1.04 110 285
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept 0.13 0.05 0.02 0.23 1.08 47 188
day -0.00 0.00 -0.00 -0.00 1.00 4550 3272
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma 0.60 0.00 0.59 0.60 1.00 6598 2554
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
There are different intercepts for each participant, but not different slopes.
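To see those person-specific intercepts directly, one option is brms’s coef(), which adds each r_id offset to the population intercept (a quick sketch; only the first participants are shown):

```r
# person-specific intercepts: b_Intercept + r_id[j,Intercept],
# summarized as posterior mean, error, and 95% interval per id
head( coef(m2)$id[, , "Intercept"] )
```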
  [1] "b_Intercept"        "b_day"              "sd_id__Intercept"
  [4] "sigma"              "Intercept"          "r_id[1,Intercept]"
  [7] "r_id[2,Intercept]"  "r_id[3,Intercept]"  "r_id[4,Intercept]"
 (... one r_id[j,Intercept] term per participant, 193 in total ...)
[199] "lprior"             "lp__"               "accept_stat__"
[202] "stepsize__"         "treedepth__"        "n_leapfrog__"
[205] "divergent__"        "energy__"
We can also add covariates at Level 2, or the person level in this case. We have a binary variable called female that refers to the gender of each person.
\[\begin{align*} \text{Level 1} &\\ \text{PA.std}_{ij} &= \alpha_j + \beta_j(\text{day}_{ij}) + \epsilon_{ij} \\ \text{Level 2} &\\ \alpha_j &= \gamma_{0,\text{female}[j]} + U_{0j} \\ \beta_j &= \gamma_{1} \\ \end{align*}\]
d$female = ifelse(d$female == 1, "female", "male")
m3 <- brm(
data=d,
family=gaussian,
PA.std ~ 0 + day + female + (1 | id),
prior = c( prior(normal(0, 1), class=b),
prior(exponential(1), class=sd),
prior(exponential(1), class=sigma)),
iter=5000, warmup=1000, chains=4, cores=4, seed=9,
file=here("files/models/m72.3")
)
 Family: gaussian
Links: mu = identity; sigma = identity
Formula: PA.std ~ 0 + day + female + (1 | id)
Data: d (Number of observations: 13033)
Draws: 4 chains, each with iter = 5000; warmup = 1000; thin = 1;
total post-warmup draws = 16000
Multilevel Hyperparameters:
~id (Number of levels: 193)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.76 0.04 0.69 0.85 1.01 566 1542
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
day -0.00 0.00 -0.00 -0.00 1.00 18822 13488
femalefemale 0.35 0.08 0.19 0.49 1.01 320 688
femalemale -0.19 0.08 -0.35 -0.03 1.01 368 664
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma 0.60 0.00 0.59 0.60 1.00 31230 10820
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
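Because the model uses 0 + … index coding, the gender effect is the difference between the two intercepts. One way to summarize it is brms’s hypothesis() (a sketch; the coefficient names match the summary above):

```r
# posterior for the female-minus-male difference in average PA
hypothesis(m3, "femalefemale - femalemale = 0")
```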
  [1] "b_day"              "b_femalefemale"     "b_femalemale"
  [4] "sd_id__Intercept"   "sigma"              "r_id[1,Intercept]"
  [7] "r_id[2,Intercept]"  "r_id[3,Intercept]"  "r_id[4,Intercept]"
 (... one r_id[j,Intercept] term per participant, 193 in total ...)
[199] "lprior"             "lp__"               "accept_stat__"
[202] "stepsize__"         "treedepth__"        "n_leapfrog__"
[205] "divergent__"        "energy__"
set.seed(9)
sample_id = sample(unique(d$id), replace=F, size = 20)
distinct(d, id, female, day) %>%
filter(id %in% sample_id) %>%
add_epred_draws(m3) %>%
mean_qi() %>%
ggplot( aes(x=day, y=.epred, group = as.factor(id), color=as.factor(female))) +
geom_line() +
scale_color_manual(values=c("#1c5253" , "#e07a5f")) +
labs(x="day",y="PA.std", color = "female") +
facet_wrap(~female) +
theme(legend.position = "top")
McElreath doesn’t cover this in his video lecture, but this is from the textbook and worth discussing.
Data from Silk et al. (2005)
mean sd 5.5% 94.5% histogram
actor 4.0000000 2.0019871 1.000 7.000 ▇▇▁▇▁▇▁▇▁▇▁▇
recipient 5.0000000 2.0039801 2.000 8.000 ▇▇▁▇▁▇▁▇▁▇▁▇
condition 0.5000000 0.5004968 0.000 1.000 ▇▁▁▁▁▁▁▁▁▇
block 3.5000000 1.7095219 1.000 6.000 ▇▇▁▇▁▇▁▇▁▇
trial 36.5000000 20.8032533 4.665 68.335 ▇▇▇▇▇▇▇▁
prosoc_left 0.5000000 0.5004968 0.000 1.000 ▇▁▁▁▁▁▁▁▁▇
chose_prosoc 0.5674603 0.4959204 0.000 1.000 ▅▁▁▁▁▁▁▁▁▇
pulled_left 0.5793651 0.4941515 0.000 1.000 ▅▁▁▁▁▁▁▁▁▇
[1] 1 2 3 4 5 6 7
[1] 1 2 3 4 5 6
[1] 0 1
[1] 0 1
We could model the interaction between condition (presence/absence of another animal) and option (which side is prosocial), but it is more difficult to assign sensible priors to interaction effects. Another option, because we’re working with categorical variables, is to turn our 2x2 into one variable with 4 levels.
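Following the Rethinking text, the 2x2 design can be collapsed into a single 4-level index with one line of arithmetic (a sketch, assuming d2 holds the chimpanzee data):

```r
# 1 = prosocial right/no partner, 2 = prosocial left/no partner,
# 3 = prosocial right/partner,    4 = prosocial left/partner
d2$treatment <- factor( 1 + d2$prosoc_left + 2*d2$condition )
```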
In this experiment, each pull is within a cluster of pulls belonging to an individual chimpanzee. But each pull is also within an experimental block, which represents a collection of observations that happened on the same day. So each observed pull belongs to both an actor (1 to 7) and a block (1 to 6). There may be unique intercepts for each actor as well as for each block.
Mathematical model:
\[\begin{align*} L_i &\sim \text{Binomial}(1, p_i) \\ \text{logit}(p_i) &= \alpha_{\text{ACTOR[i]}} + \gamma_{\text{BLOCK[i]}} + \beta_{\text{TREATMENT[i]}} \\ \beta &\sim \text{Normal}(0, 1.5)\\ \alpha_j &\sim \text{Normal}(0, \sigma_{\alpha}) \text{ , for }j=1..7\\ \gamma_k &\sim \text{Normal}(0, \sigma_{\gamma}) \text{ , for }k=1..6\\ \sigma_{\alpha} &\sim \text{Exponential}(1) \\ \sigma_{\gamma} &\sim \text{Exponential}(1) \\ \end{align*}\]
m4 <-
brm(
family = bernoulli,
data = d2,
pulled_left ~ 0 + treatment + (1 | actor) + (1 | block),
prior = c(prior(normal(0, 1.5), class = b),
prior(exponential(1), class = sd, group = actor),
prior(exponential(1), class = sd, group = block)),
chains=4, cores=4, iter=2000, warmup=1000,
seed = 1,
file = here("files/models/m72.4")
)
 Family: bernoulli
Links: mu = logit
Formula: pulled_left ~ 0 + treatment + (1 | actor) + (1 | block)
Data: d2 (Number of observations: 504)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Multilevel Hyperparameters:
~actor (Number of levels: 7)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 1.96 0.63 1.06 3.45 1.00 1250 1660
~block (Number of levels: 6)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.20 0.17 0.01 0.63 1.00 1351 1690
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
treatment1 0.25 0.57 -0.90 1.38 1.00 1083 1577
treatment2 0.85 0.57 -0.31 1.94 1.00 1044 1743
treatment3 -0.16 0.57 -1.33 0.94 1.00 1062 1573
treatment4 0.72 0.57 -0.46 1.81 1.00 1022 1565
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Estimate Est.Error Q2.5 Q97.5
b_treatment1 0.253 0.569 -0.896 1.375
b_treatment2 0.854 0.572 -0.313 1.939
b_treatment3 -0.163 0.573 -1.331 0.944
b_treatment4 0.723 0.574 -0.457 1.812
sd_actor__Intercept 1.958 0.627 1.058 3.448
sd_block__Intercept 0.204 0.171 0.008 0.630
r_actor[1,Intercept] -0.768 0.587 -1.885 0.413
r_actor[2,Intercept] 4.169 1.280 2.187 7.250
r_actor[3,Intercept] -1.073 0.583 -2.165 0.118
r_actor[4,Intercept] -1.068 0.598 -2.199 0.166
r_actor[5,Intercept] -0.768 0.594 -1.902 0.478
r_actor[6,Intercept] 0.180 0.593 -0.972 1.330
r_actor[7,Intercept] 1.717 0.653 0.493 3.089
r_block[1,Intercept] -0.158 0.217 -0.689 0.138
r_block[2,Intercept] 0.037 0.182 -0.339 0.452
r_block[3,Intercept] 0.052 0.184 -0.278 0.499
r_block[4,Intercept] 0.015 0.182 -0.347 0.412
r_block[5,Intercept] -0.027 0.182 -0.430 0.353
r_block[6,Intercept] 0.106 0.196 -0.202 0.586
lprior -8.049 0.874 -10.255 -6.825
lp__ -288.904 3.866 -297.349 -282.508
Zooming in on just the actor and block effects. (Remember, these are differences from the weighted average.)
as_draws_df(m4) %>%
select(starts_with("sd")) %>%
pivot_longer(everything()) %>%
ggplot(aes(x = value, fill = name)) +
geom_density(linewidth = 0, alpha = 3/4, adjust = 2/3, show.legend = F) +
annotate(geom = "text", x = 0.67, y = 2, label = "block", color = "#5e8485") +
annotate(geom = "text", x = 2.725, y = 0.5, label = "actor", color = "#0f393a") +
scale_fill_manual(values = c("#0f393a", "#5e8485")) +
scale_y_continuous(NULL, breaks = NULL) +
ggtitle(expression(sigma["group"])) +
coord_cartesian(xlim = c(0, 4))
Let’s look at the predictions by actor and block to confirm.
d2 %>% distinct(actor, block, treatment) %>%
add_epred_draws(m4) %>%
mutate(treatment=factor(treatment,
levels=as.character(1:4),
labels=t_labels)) %>%
group_by(actor, treatment) %>%
mean_qi(.epred) %>%
ggplot( aes(x=treatment, y=.epred, group=1) ) +
geom_point() +
geom_line() +
labs(x=NULL, y="p(pull left)", title="by actor") +
facet_wrap(~actor)
d2 %>% distinct(actor, block, treatment) %>%
add_epred_draws(m4) %>%
mutate(treatment=factor(treatment,
levels=as.character(1:4),
labels=t_labels)) %>%
group_by(block, treatment) %>%
mean_qi(.epred) %>%
ggplot( aes(x=treatment, y=.epred, group=1) ) +
geom_point() +
geom_line() +
labs(x=NULL, y="p(pull left)", title="by block") +
facet_wrap(~block)
Let’s start by simulating cafe data.
# ---- set population-level parameters -----
a <- 3.5 # average morning wait time
b <- (-1) # average difference afternoon wait time
sigma_a <- 1 # std dev in intercepts
sigma_b <- 0.5 # std dev in slopes
rho <- (-0.7) #correlation between intercepts and slopes
# ---- create vector of means ----
Mu <- c(a, b)
# ---- create matrix of variances and covariances ----
sigmas <- c(sigma_a,sigma_b) # standard deviations
Rho <- matrix( c(1,rho,rho,1) , nrow=2 ) # correlation matrix
# now matrix multiply to get covariance matrix
Sigma <- diag(sigmas) %*% Rho %*% diag(sigmas)
# ---- simulate intercepts and slopes -----
N_cafes = 20
library(MASS)
set.seed(5)
vary_effects <- mvrnorm( n=N_cafes, mu = Mu, Sigma=Sigma)
a_cafe <- vary_effects[, 1]
b_cafe <- vary_effects[, 2]
# ---- simulate observations -----
set.seed(22)
N_visits <- 10
afternoon <- rep(0:1,N_visits*N_cafes/2)
cafe_id <- rep( 1:N_cafes , each=N_visits )
mu <- a_cafe[cafe_id] + b_cafe[cafe_id]*afternoon
sigma <- 0.5 # std dev within cafes
wait <- rnorm( N_visits*N_cafes , mu , sigma )
d3 <- data.frame( cafe=cafe_id , afternoon=afternoon , wait=wait )
               mean        sd     5.5%     94.5%  histogram
cafe 10.500000 5.7807513 2.000000 19.000000 ▇▇▇▇▇▇▇▇▇▇
afternoon 0.500000 0.5012547 0.000000 1.000000 ▇▁▁▁▁▁▁▁▁▇
wait 3.073405 1.1082252 1.477719 4.869957 ▁▁▂▅▇▇▅▇▃▃▁▁▁▁
cafe afternoon wait
1 1 0 3.967893
2 1 1 3.857198
3 1 0 4.727875
4 1 1 2.761013
5 1 0 4.119483
6 1 1 3.543652
7 1 0 4.190949
8 1 1 2.533223
9 1 0 4.124032
10 1 1 2.764887
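With the values chosen above (sigma_a = 1, sigma_b = 0.5, rho = -0.7), the diag(sigmas) %*% Rho %*% diag(sigmas) product works out to:

\[\mathbf{\Sigma} = \begin{pmatrix} 1 & 0 \\ 0 & 0.5 \end{pmatrix} \begin{pmatrix} 1 & -0.7 \\ -0.7 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 0.5 \end{pmatrix} = \begin{pmatrix} 1.00 & -0.35 \\ -0.35 & 0.25 \end{pmatrix}\]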
In this exercise, we are simulating data from a generative process and then analyzing those data with a model that reflects exactly the correct structure of that process. But in the real world, we’re never so lucky. Instead we are always forced to analyze data with a model that is MISSPECIFIED: the true data-generating process is different from the model. Simulation can be used, however, to explore misspecification. Just simulate data from a process and then see how a number of models, none of which matches the data-generating process exactly, perform. And always remember that Bayesian inference does not depend upon data-generating assumptions, such as the likelihood, being true. Non-Bayesian approaches may depend upon sampling distributions for their inferences, but this is not the case for a Bayesian model. In a Bayesian model, a likelihood is a prior for the data, and inference about parameters can be surprisingly insensitive to its details.
Mathematical model:
likelihood function and linear model
\[\begin{align*} W_i &\sim \text{Normal}(\mu_i, \sigma) \\ \mu_i &= \alpha_{CAFE[i]} + \beta_{CAFE[i]}(\text{Afternoon}_i) \end{align*}\]
varying intercepts and slopes
\[\begin{align*} \begin{bmatrix} \alpha_{CAFE[i]} \\ \beta_{CAFE[i]} \end{bmatrix} &\sim \text{MVNormal}\left( \begin{bmatrix} \alpha \\ \beta \end{bmatrix}, \mathbf{S}\right) \\ \mathbf{S} &= \begin{pmatrix} \sigma_{\alpha} & 0 \\ 0 & \sigma_{\beta}\end{pmatrix}\mathbf{R}\begin{pmatrix} \sigma_{\alpha} & 0 \\ 0 & \sigma_{\beta}\end{pmatrix} \\ \end{align*}\]
priors
\[\begin{align*} \alpha &\sim \text{Normal}(5,2) \\ \beta &\sim \text{Normal}(-1,0.5) \\ \sigma &\sim \text{Exponential}(1) \\ \sigma_{\alpha} &\sim \text{Exponential}(1) \\ \sigma_{\beta} &\sim \text{Exponential}(1) \\ \mathbf{R} &\sim \text{LKJcorr}(2) \end{align*}\]
# examples
rlkj_1 = rethinking::rlkjcorr(1e4, K=2, eta=1)
rlkj_2 = rethinking::rlkjcorr(1e4, K=2, eta=2)
rlkj_4 = rethinking::rlkjcorr(1e4, K=2, eta=4)
rlkj_6 = rethinking::rlkjcorr(1e4, K=2, eta=6)
data.frame(rlkj_1= rlkj_1[,1,2],
rlkj_2= rlkj_2[,1,2],
rlkj_4= rlkj_4[,1,2],
rlkj_6= rlkj_6[,1,2]) %>%
ggplot() +
geom_density(aes(x=rlkj_1, color = "1"), alpha=.3) +
geom_density(aes(x=rlkj_2, color = "2"), alpha=.3) +
geom_density(aes(x=rlkj_4, color = "4"), alpha=.3) +
geom_density(aes(x=rlkj_6, color = "6"), alpha=.3) +
labs(x="R", color="eta") +
theme(legend.position = "top")
m5 <- brm(
data = d3,
family = gaussian,
wait ~ 1 + afternoon + (1 + afternoon | cafe),
prior = c(
prior( normal(5,2), class=Intercept ),
prior( normal(-1, .5), class=b),
prior( exponential(1), class=sd),
prior( exponential(1), class=sigma),
prior( lkj(2), class=cor)
),
iter=2000, warmup=1000, chains=4, cores=4, seed=9,
file=here("files/models/m72.5")
)
                               Estimate Est.Error  Q2.5 Q97.5
b_Intercept 3.66 0.23 3.21 4.11
b_afternoon -1.13 0.14 -1.41 -0.85
sd_cafe__Intercept 0.97 0.17 0.71 1.37
sd_cafe__afternoon 0.59 0.12 0.39 0.87
cor_cafe__Intercept__afternoon -0.51 0.18 -0.80 -0.10
sigma 0.47 0.03 0.42 0.53
Intercept 3.10 0.20 2.71 3.50
r_cafe[1,Intercept] 0.55 0.29 -0.01 1.14
r_cafe[2,Intercept] -1.50 0.30 -2.10 -0.90
r_cafe[3,Intercept] 0.70 0.30 0.12 1.30
r_cafe[4,Intercept] -0.42 0.29 -0.98 0.15
r_cafe[5,Intercept] -1.79 0.30 -2.36 -1.20
r_cafe[6,Intercept] 0.60 0.29 0.02 1.18
r_cafe[7,Intercept] -0.05 0.29 -0.64 0.55
r_cafe[8,Intercept] 0.28 0.30 -0.31 0.86
r_cafe[9,Intercept] 0.32 0.29 -0.28 0.90
r_cafe[10,Intercept] -0.10 0.29 -0.68 0.49
r_cafe[11,Intercept] -1.74 0.29 -2.31 -1.17
r_cafe[12,Intercept] 0.18 0.29 -0.40 0.76
r_cafe[13,Intercept] 0.22 0.30 -0.37 0.82
r_cafe[14,Intercept] -0.49 0.30 -1.08 0.09
r_cafe[15,Intercept] 0.79 0.30 0.22 1.39
r_cafe[16,Intercept] -0.27 0.29 -0.86 0.32
r_cafe[17,Intercept] 0.55 0.30 -0.03 1.15
r_cafe[18,Intercept] 2.08 0.30 1.52 2.68
r_cafe[19,Intercept] -0.42 0.30 -1.00 0.16
r_cafe[20,Intercept] 0.07 0.30 -0.53 0.65
r_cafe[1,afternoon] -0.03 0.29 -0.60 0.53
r_cafe[2,afternoon] 0.23 0.30 -0.36 0.79
r_cafe[3,afternoon] -0.80 0.30 -1.39 -0.24
r_cafe[4,afternoon] -0.10 0.28 -0.65 0.43
r_cafe[5,afternoon] 1.00 0.30 0.42 1.57
r_cafe[6,afternoon] -0.16 0.28 -0.72 0.38
r_cafe[7,afternoon] 0.10 0.28 -0.46 0.64
r_cafe[8,afternoon] -0.50 0.29 -1.09 0.07
r_cafe[9,afternoon] -0.17 0.28 -0.73 0.38
r_cafe[10,afternoon] 0.18 0.29 -0.39 0.75
r_cafe[11,afternoon] 0.70 0.29 0.15 1.27
r_cafe[12,afternoon] -0.06 0.29 -0.63 0.50
r_cafe[13,afternoon] -0.68 0.29 -1.25 -0.12
r_cafe[14,afternoon] 0.19 0.29 -0.36 0.78
r_cafe[15,afternoon] -1.06 0.31 -1.65 -0.44
r_cafe[16,afternoon] 0.09 0.28 -0.47 0.64
r_cafe[17,afternoon] -0.09 0.28 -0.64 0.47
r_cafe[18,afternoon] 0.10 0.30 -0.49 0.70
r_cafe[19,afternoon] 0.87 0.30 0.29 1.47
r_cafe[20,afternoon] 0.07 0.29 -0.49 0.66
lprior -5.06 0.44 -6.07 -4.36
lp__ -197.20 7.16 -212.07 -184.27
Let’s get the slopes and intercepts for each cafe.
cafe_params = m5 %>% spread_draws(b_Intercept, b_afternoon, r_cafe[id, term]) %>%
pivot_wider(names_from = term, values_from = r_cafe) %>%
mutate( intercepts =b_Intercept + Intercept,
slopes = b_afternoon + afternoon) %>%
mean_qi(intercepts, slopes)
cafe_params %>%
dplyr::select(cafe=id, intercepts, slopes) %>%
round(2)
   cafe intercepts slopes
1 1 4.21 -1.16
2 2 2.16 -0.90
3 3 4.37 -1.93
4 4 3.24 -1.23
5 5 1.88 -0.14
6 6 4.26 -1.29
7 7 3.62 -1.03
8 8 3.94 -1.63
9 9 3.98 -1.31
10 10 3.56 -0.95
11 11 1.93 -0.43
12 12 3.84 -1.19
13 13 3.88 -1.81
14 14 3.18 -0.94
15 15 4.46 -2.19
16 16 3.39 -1.04
17 17 4.22 -1.22
18 18 5.75 -1.03
19 19 3.25 -0.26
20 20 3.73 -1.06
More about stat_ellipse here.
We can use the slopes and intercepts to calculate the expected morning and afternoon wait times for each cafe.
cafe_params %>%
mutate(
morning = intercepts,
afternoon = intercepts + slopes
) %>%
ggplot( aes(x=morning, y=afternoon) ) +
mapply(function(level) {
stat_ellipse(geom = "polygon", type = "norm",
linewidth = 0, alpha = .1, fill = "#1c5253",
level = level)
},
# enter the levels here
level = c(1:9 / 10, .99)) +
geom_point(size = 2)+
labs(x="morning wait time",
y="afternoon wait time")
Or, we can use epred_draws.
d3 %>%
distinct(cafe, afternoon) %>%
add_epred_draws(m5) %>%
group_by(cafe, afternoon) %>%
summarise(wait = mean(.epred), .groups = "drop") %>%
mutate(afternoon =ifelse(afternoon==1, "afternoon", "morning")) %>%
pivot_wider(names_from = afternoon, values_from=wait) %>%
ggplot( aes(x=morning, y=afternoon) ) +
mapply(function(level) {
stat_ellipse(geom = "polygon", type = "norm",
linewidth = 0, alpha = .1, fill = "#1c5253",
level = level)
},
# enter the levels here
level = c(1:9 / 10, .99)) +
geom_point(size = 2)+
labs(x="morning wait time",
y="afternoon wait time")
What is the correlation of our intercepts and slopes?
# A tibble: 1 × 6
cor_cafe__Intercept__afternoon .lower .upper .width .point .interval
<dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 -0.506 -0.799 -0.0956 0.95 mean qi
post = as_draws_df(m5)
rlkj_2 = rethinking::rlkjcorr(nrow(post), K=2, eta=2)
data.frame(prior= rlkj_2[,1,2],
posterior = post$cor_cafe__Intercept__afternoon) %>%
ggplot() +
geom_density(aes(x=prior, color = "prior"), alpha=.3) +
geom_density(aes(x=posterior, color = "posterior"), alpha=.3) +
scale_color_manual(values = c("#1c5253" , "#e07a5f")) +
labs(x="R") +
theme(legend.position = "top")
Let’s apply what we’ve learned to our affect data:
PA_lag is a within-person variable (Level 1), whereas female is a between-person variable (Level 2).
Mathematical model with varying slopes
Likelihood function and linear model
\[\begin{align*} \text{PA.std}_i &\sim \text{Normal}(\mu_i, \sigma) \\ \mu_i &= \alpha_{\text{id}[i]} + \beta_{\text{id}[i]}(\text{PA\_lag}_i) \end{align*}\]
Varying intercepts and slopes:
\[\begin{align*} \alpha_{\text{id}[i]} &= \gamma_{\alpha} + U_{\alpha,\text{id}[i]} \\ \beta_{\text{id}[i]} &= \gamma_{\beta} + U_{\beta,\text{id}[i]} \end{align*}\]
Random effects:
\[\begin{align*} \begin{bmatrix} U_{\alpha,\text{id}[i]} \\ U_{\beta,\text{id}[i]} \end{bmatrix} &\sim \text{MVNormal}\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \mathbf{S}\right) \\ \mathbf{S} &= \begin{pmatrix} \sigma_{\alpha} & 0 \\ 0 & \sigma_{\beta}\end{pmatrix}\mathbf{R}\begin{pmatrix} \sigma_{\alpha} & 0 \\ 0 & \sigma_{\beta}\end{pmatrix} \end{align*}\]
Priors: \[\begin{align*} \gamma_\alpha &\sim \text{Normal}(0,1) \\ \gamma_\beta &\sim \text{Normal}(0,1) \\ \sigma &\sim \text{Exponential}(1) \\ \sigma_{\alpha} &\sim \text{Exponential}(1) \\ \sigma_{\beta} &\sim \text{Exponential}(1) \\ \mathbf{R} &\sim \text{LKJcorr}(2) \end{align*}\]
m6 <- brm(
data=d,
family=gaussian,
PA.std ~ 1 + PA_lag + (1 + PA_lag | id),
prior = c( prior( normal(0,1), class=Intercept ),
prior( normal(0,1), class=b ),
prior( exponential(1), class=sigma ),
prior( exponential(1), class=sd ),
prior( lkj(2), class=cor)),
iter=4000, warmup=1000, seed=9, cores=4, chains=4,
file=here("files/models/m72.6")
)
 Family: gaussian
Links: mu = identity; sigma = identity
Formula: PA.std ~ 1 + PA_lag + (1 + PA_lag | id)
Data: d (Number of observations: 13033)
Draws: 4 chains, each with iter = 4000; warmup = 1000; thin = 1;
total post-warmup draws = 12000
Multilevel Hyperparameters:
~id (Number of levels: 193)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
sd(Intercept) 0.43 0.03 0.39 0.49 1.00 1514
sd(PA_lag) 0.11 0.01 0.09 0.14 1.00 3422
cor(Intercept,PA_lag) 0.08 0.11 -0.13 0.28 1.00 5402
Tail_ESS
sd(Intercept) 2817
sd(PA_lag) 6761
cor(Intercept,PA_lag) 7765
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept -0.01 0.03 -0.08 0.05 1.00 605 893
PA_lag 0.44 0.01 0.42 0.47 1.00 5254 7769
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma 0.54 0.00 0.54 0.55 1.00 17859 8363
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
What can we learn from this model?
# plot posterior density
post = as_draws_df(m6)  # posterior draws; defined here since this excerpt omits it
p1 = post %>%
ggplot(aes(x = b_PA_lag)) +
stat_halfeye(fill="#1c5253") +
geom_vline( aes(xintercept=0), linetype = "dashed" ) +
labs( title="posterior density",
x="slope",
y=NULL) +
scale_y_continuous(breaks=NULL)
# posterior lines
p2 = d %>%
sample_n(2000) %>%
ggplot(aes( x=PA_lag, y=PA.std )) +
geom_point(alpha=.2, shape = 20) +
geom_abline( aes(intercept=b_Intercept, slope=b_PA_lag),
data=post[1:50, ],
alpha=.3,
color="#1c5253") +
labs( x="PA (lagged)",
y="PA",
title="posterior lines")
p1 + p2
How much within-person variability is accounted for by variability in lagged positive affect? Let's compare estimates of the residual variability of these models.
# model with no predictors
m1_sigma = spread_draws(m1, sigma) %>% mutate(model = "intercept only")
m6_sigma = spread_draws(m6, sigma) %>% mutate(model = "with lag")
m1_sigma %>%
full_join(m6_sigma) %>%
group_by(model) %>%
  mean_qi(sigma)
# A tibble: 2 × 7
model sigma .lower .upper .width .point .interval
<chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 intercept only 0.601 0.593 0.608 0.95 mean qi
2 with lag 0.543 0.536 0.550 0.95 mean qi
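A rough way to turn these two sigmas into a proportion of within-person variance explained is to compare them on the variance scale. The sketch below uses the posterior means from the table above as point estimates; a fully Bayesian version would do the same computation draw by draw across the joined posteriors:

```r
# proportional reduction in within-person (level-1) variance,
# computed from the posterior mean sigmas reported above
sigma_m1 <- 0.601  # intercept-only model
sigma_m6 <- 0.543  # model with lagged PA
prop_explained <- 1 - sigma_m6^2 / sigma_m1^2
round(prop_explained, 2)  # about 0.18
```

By this crude measure, yesterday's positive affect accounts for roughly 18% of the within-person variability in today's positive affect.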
p1 = m1_sigma %>%
full_join(m6_sigma) %>%
ggplot( aes(y=sigma, fill=model )) +
stat_halfeye(alpha=.5) +
labs(
x=NULL,
y = "within-person sd",
title = "posterior distributions of\neach model") +
scale_x_continuous(breaks=NULL) +
theme(legend.position = "bottom")
p2 = data.frame(diff = m6_sigma$sigma - m1_sigma$sigma) %>%
ggplot( aes(x=diff)) +
stat_halfeye() +
labs(
y=NULL,
x = "m6-m1",
title = "posterior distribution of\ndifference in sigma") +
scale_y_continuous(breaks=NULL)
p2 + p1
To what extent do people differ in the holdover effect of PA?
set.seed(1)
sample_id = sample(unique(d$id), replace=F, size = 6)
d %>%
filter(id %in% sample_id) %>%
ggplot( aes( x=PA_lag, y=PA.std ) ) +
geom_point(alpha=.5, shape=20) +
geom_abline( aes(intercept=b_Intercept, slope=b_PA_lag),
data = post[1:50, ],
alpha=.4,
color="#1c5253") +
facet_wrap(~id) +
labs(y="today",
       x="yesterday")
m6 %>% spread_draws(b_Intercept, b_PA_lag, r_id[id, term]) %>%
pivot_wider(names_from = term, values_from = r_id) %>%
  filter(id %in% sample_id)
# A tibble: 72,000 × 8
# Groups: id [6]
.chain .iteration .draw b_Intercept b_PA_lag id Intercept PA_lag
<int> <int> <int> <dbl> <dbl> <int> <dbl> <dbl>
1 1 1 1 -0.0308 0.431 14 -0.0981 -0.0460
2 1 1 1 -0.0308 0.431 48 -0.0973 -0.0322
3 1 1 1 -0.0308 0.431 85 -0.347 -0.000398
4 1 1 1 -0.0308 0.431 163 -0.237 -0.133
5 1 1 1 -0.0308 0.431 204 -0.766 -0.0597
6 1 1 1 -0.0308 0.431 209 -0.270 -0.122
7 1 2 2 -0.0321 0.432 14 -0.0714 0.151
8 1 2 2 -0.0321 0.432 48 -0.150 -0.166
9 1 2 2 -0.0321 0.432 85 -0.164 0.108
10 1 2 2 -0.0321 0.432 163 -0.0101 0.176
# ℹ 71,990 more rows
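The offsets above are deviations from the fixed effects, so each person's implied line is the population coefficient plus their offset. For example, using the first-draw values for id 14 printed above:

```r
# person-specific line = population (fixed) effect + that person's offset
# numbers taken from the first posterior draw shown above (id 14)
b_Intercept <- -0.0308   # population intercept
b_PA_lag    <-  0.431    # population lagged-PA slope
r_Intercept <- -0.0981   # id 14's intercept offset
r_PA_lag    <- -0.0460   # id 14's slope offset

intercept_14 <- b_Intercept + r_Intercept
slope_14     <- b_PA_lag + r_PA_lag
c(intercept_14, slope_14)  # -0.1289 and 0.385
```

This is exactly the arithmetic the `mutate()` call in the next chunk performs for every draw and every person.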
post2 = m6 %>% spread_draws(b_Intercept, b_PA_lag, r_id[id, term]) %>%
pivot_wider(names_from = term, values_from = r_id) %>%
mutate(intercept = b_Intercept + Intercept,
slope = b_PA_lag + PA_lag) %>%
filter(id %in% sample_id) %>%
with_groups(id, sample_n, 50)
d %>%
filter(id %in% sample_id) %>%
ggplot( aes( x=PA_lag, y=PA.std ) ) +
geom_point(alpha=.5, shape=20) +
geom_abline( aes(intercept=intercept, slope=slope, color=as.factor(id)),
data = post2,
alpha=.4) +
facet_wrap(~id) +
guides(color="none") +
labs(y="today",
       x="yesterday")
post2 = m6 %>% spread_draws(b_Intercept, b_PA_lag, r_id[id, term]) %>%
pivot_wider(names_from = term, values_from = r_id) %>%
mutate(intercept = b_Intercept + Intercept,
slope = b_PA_lag + PA_lag) %>%
filter(id %in% sample_id) %>%
with_groups(id, sample_n, 50)
d %>%
filter(id %in% sample_id) %>%
ggplot( aes( x=PA_lag, y=PA.std ) ) +
geom_point(alpha=.5, shape=20) +
geom_abline( aes(intercept=intercept, slope=slope, color=as.factor(id)),
data = post2,
alpha=.4) +
geom_abline( aes(intercept=b_Intercept, slope=b_PA_lag),
data = post[1:50, ],
alpha=.4) +
facet_wrap(~id) +
guides(color="none") +
labs(y="today",
       x="yesterday")
Back to the question: how much do these slopes differ?
Is someone’s overall positive affect related to their holdover effect? Start by extracting each person’s random offsets for the intercept and the lagged slope:
# A tibble: 2,316,000 × 6
# Groups: id [193]
id .chain .iteration .draw Intercept PA_lag
<int> <int> <int> <int> <dbl> <dbl>
1 1 1 1 1 0.0902 0.00725
2 1 1 2 2 0.0776 0.0551
3 1 1 3 3 0.122 -0.0274
4 1 1 4 4 0.0212 -0.00127
5 1 1 5 5 0.150 0.0538
6 1 1 6 6 -0.0100 -0.0403
7 1 1 7 7 0.0670 0.0744
8 1 1 8 8 0.0968 0.0747
9 1 1 9 9 0.0320 -0.0629
10 1 1 10 10 0.00455 0.132
# ℹ 2,315,990 more rows
m6 %>% spread_draws(r_id[id, term]) %>%
pivot_wider(names_from = term, values_from = r_id) %>%
ungroup() %>%
dplyr::select(.draw, Intercept, PA_lag) %>%
nest(data= -.draw) %>%
  mutate(r = map_dbl(data, ~cor(.$Intercept, .$PA_lag)))
# A tibble: 12,000 × 3
.draw data r
<int> <list> <dbl>
1 1 <tibble [193 × 2]> 0.162
2 2 <tibble [193 × 2]> 0.134
3 3 <tibble [193 × 2]> 0.000967
4 4 <tibble [193 × 2]> -0.0256
5 5 <tibble [193 × 2]> -0.00712
6 6 <tibble [193 × 2]> 0.0129
7 7 <tibble [193 × 2]> 0.0624
8 8 <tibble [193 × 2]> 0.00134
9 9 <tibble [193 × 2]> 0.0653
10 10 <tibble [193 × 2]> -0.0965
# ℹ 11,990 more rows
m6 %>% spread_draws(r_id[id, term]) %>%
pivot_wider(names_from = term, values_from = r_id) %>%
ungroup() %>%
dplyr::select(.draw, Intercept, PA_lag) %>%
nest(data= -.draw) %>%
mutate(r = map_dbl(data, ~cor(.$Intercept, .$PA_lag))) %>%
  mean_qi(r)
# A tibble: 1 × 6
r .lower .upper .width .point .interval
<dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 0.0799 -0.0787 0.236 0.95 mean qi
Both approaches tell the same story: the correlation between a person’s overall level of positive affect and their holdover effect is about 0.08, with a 95% interval spanning zero (compare cor(Intercept, PA_lag) in the model summary for m6). There is little evidence that people who are happier on average also carry over more of yesterday’s affect.